SEQUENCE LISTING



(1) GENERAL INFORMATION: 



(i) APPLICANT: Rasmussen, Grethe 

Mikkelsen, Jan Moller 
Schulein, Martin 
Patkar, Shankant A. 
Hagen, Fred 



(ii) TITLE OF INVENTION: A Cellulase Preparation Comprising an 
Endoglucanase Enzyme 



(iii) NUMBER OF SEQUENCES: 33 



(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Novo Nordisk of North America, Inc. 

(B) STREET: 405 Lexington Avenue, 64th Floor

(C) CITY: New York 

(D) STATE: New York 

(E) COUNTRY: United States of America 

(F) ZIP: 10174-6401 



(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/389,423 

(B) FILING DATE: 14-FEB-1995 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Lambiris, Elias J.

(B) REGISTRATION NUMBER: 33,728 

(C) REFERENCE/DOCKET NUMBER: 3469.214-US 



(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 212-867-0123 

(B) TELEFAX: 212-878-9655 



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1060 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Humicola insolens 

(B) STRAIN: DSM 1800 

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 73..924

(ix) FEATURE:

(A) NAME/KEY: sig_peptide

(B) LOCATION: 10..72

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 10..924

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

GGATCCAAG ATG CGT TCC TCC CCC CTC CTC CCG TCC GCC GTT GTG GCC 48
Met Arg Ser Ser Pro Leu Leu Pro Ser Ala Val Val Ala
-21 -20 -15 -10

GCC CTG CCG GTG TTG GCC CTT GCC GCT GAT GGC AGG TCC ACC CGC TAC 96
Ala Leu Pro Val Leu Ala Leu Ala Ala Asp Gly Arg Ser Thr Arg Tyr
-5 1 5

TGG GAC TGC TGC AAG CCT TCG TGC GGC TGG GCC AAG AAG GCT CCC GTG 144 
Trp Asp Cys Cys Lys Pro Ser Cys Gly Trp Ala Lys Lys Ala Pro Val 
10 15 20 

AAC CAG CCT GTC TTT TCC TGC AAC GCC AAC TTC CAG CGT ATC ACG GAC 192 
Asn Gln Pro Val Phe Ser Cys Asn Ala Asn Phe Gln Arg Ile Thr Asp
25 30 35 40 

TTC GAC GCC AAG TCC GGC TGC GAG CCG GGC GGT GTC GCC TAC TCG TGC 240
Phe Asp Ala Lys Ser Gly Cys Glu Pro Gly Gly Val Ala Tyr Ser Cys 
45 50 55 

GCC GAC CAG ACC CCA TGG GCT GTG AAC GAC GAC TTC GCG CTC GGT TTT 288 
Ala Asp Gln Thr Pro Trp Ala Val Asn Asp Asp Phe Ala Leu Gly Phe
60 65 70 

GCT GCC ACC TCT ATT GCC GGC AGC AAT GAG GCG GGC TGG TGC TGC GCC 336 
Ala Ala Thr Ser Ile Ala Gly Ser Asn Glu Ala Gly Trp Cys Cys Ala
75 80 85 

TGC TAC GAG CTC ACC TTC ACA TCC GGT CCT GTT GCT GGC AAG AAG ATG 384 
Cys Tyr Glu Leu Thr Phe Thr Ser Gly Pro Val Ala Gly Lys Lys Met 
90 95 100 

GTC GTC CAG TCC ACC AGC ACT GGC GGT GAT CTT GGC AGC AAC CAC TTC 432
Val Val Gln Ser Thr Ser Thr Gly Gly Asp Leu Gly Ser Asn His Phe
105 110 115 120 

GAT CTC AAC ATC CCC GGC GGC GGC GTC GGC ATC TTC GAC GGA TGC ACT 480 
Asp Leu Asn Ile Pro Gly Gly Gly Val Gly Ile Phe Asp Gly Cys Thr
125 130 135 

CCC CAG TTC GGC GGT CTG CCC GGC CAG CGC TAC GGC GGC ATC TCG TCC 528
Pro Gln Phe Gly Gly Leu Pro Gly Gln Arg Tyr Gly Gly Ile Ser Ser
140 145 150 

CGC AAC GAG TGC GAT CGG TTC CCC GAC GCC CTC AAG CCC GGC TGC TAC 576
Arg Asn Glu Cys Asp Arg Phe Pro Asp Ala Leu Lys Pro Gly Cys Tyr 
155 160 165 

TGG CGC TTC GAC TGG TTC AAG AAC GCC GAC AAT CCG AGC TTC AGC TTC 624 
Trp Arg Phe Asp Trp Phe Lys Asn Ala Asp Asn Pro Ser Phe Ser Phe 
170 175 180 

CGT CAG GTC CAG TGC CCA GCC GAG CTC GTC GCT CGC ACC GGA TGC CGC 672
Arg Gln Val Gln Cys Pro Ala Glu Leu Val Ala Arg Thr Gly Cys Arg
185 190 195 200 

CGC AAC GAC GAC GGC AAC TTC CCT GCC GTC CAG ATC CCC TCC AGC AGC 720
Arg Asn Asp Asp Gly Asn Phe Pro Ala Val Gln Ile Pro Ser Ser Ser
205 210 215

ACC AGC TCT CCG GTC AAC CAG CCT ACC AGC ACC AGC ACC ACG TCC ACC 768
Thr Ser Ser Pro Val Asn Gln Pro Thr Ser Thr Ser Thr Thr Ser Thr
220 225 230 

TCC ACC ACC TCG AGC CCG CCA GTC CAG CCT ACG ACT CCC AGC GGC TGC 816 
Ser Thr Thr Ser Ser Pro Pro Val Gln Pro Thr Thr Pro Ser Gly Cys
235 240 245 

ACT GCT GAG AGG TGG GCT CAG TGC GGC GGC AAT GGC TGG AGC GGC TGC 864 
Thr Ala Glu Arg Trp Ala Gln Cys Gly Gly Asn Gly Trp Ser Gly Cys
250 255 260 

ACC ACC TGC GTC GCT GGC AGC ACT TGC ACG AAG ATT AAT GAC TGG TAC 912 
Thr Thr Cys Val Ala Gly Ser Thr Cys Thr Lys Ile Asn Asp Trp Tyr
265 270 275 280 

CAT CAG TGC CTG TAGACGCAGG GCAGCTTGAG GGCCTTACTG GTGGCCGCAA 964 
His Gln Cys Leu

CGAAATGACA CTCCCAATCA CTGTATTAGT TCTTGTACAT AATTTCGTCA TCCCTCCAGG 1024 

GATTGTCACA TAAATGCAAT GAGGAACAAT GAGTAC 1060 

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 305 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Arg Ser Ser Pro Leu Leu Pro Ser Ala Val Val Ala Ala Leu Pro
-21 -20 -15 -10

Val Leu Ala Leu Ala Ala Asp Gly Arg Ser Thr Arg Tyr Trp Asp Cys
-5 1 5 10

Cys Lys Pro Ser Cys Gly Trp Ala Lys Lys Ala Pro Val Asn Gln Pro
15 20 25

Val Phe Ser Cys Asn Ala Asn Phe Gln Arg Ile Thr Asp Phe Asp Ala
30 35 40

Lys Ser Gly Cys Glu Pro Gly Gly Val Ala Tyr Ser Cys Ala Asp Gln
45 50 55

Thr Pro Trp Ala Val Asn Asp Asp Phe Ala Leu Gly Phe Ala Ala Thr
60 65 70 75

Ser Ile Ala Gly Ser Asn Glu Ala Gly Trp Cys Cys Ala Cys Tyr Glu
80 85 90

Leu Thr Phe Thr Ser Gly Pro Val Ala Gly Lys Lys Met Val Val Gln
95 100 105

Ser Thr Ser Thr Gly Gly Asp Leu Gly Ser Asn His Phe Asp Leu Asn
110 115 120

Ile Pro Gly Gly Gly Val Gly Ile Phe Asp Gly Cys Thr Pro Gln Phe
125 130 135

Gly Gly Leu Pro Gly Gln Arg Tyr Gly Gly Ile Ser Ser Arg Asn Glu
140 145 150 155

Cys Asp Arg Phe Pro Asp Ala Leu Lys Pro Gly Cys Tyr Trp Arg Phe
160 165 170

Asp Trp Phe Lys Asn Ala Asp Asn Pro Ser Phe Ser Phe Arg Gln Val
175 180 185

Gln Cys Pro Ala Glu Leu Val Ala Arg Thr Gly Cys Arg Arg Asn Asp
190 195 200

Asp Gly Asn Phe Pro Ala Val Gln Ile Pro Ser Ser Ser Thr Ser Ser
205 210 215

Pro Val Asn Gln Pro Thr Ser Thr Ser Thr Thr Ser Thr Ser Thr Thr
220 225 230 235

Ser Ser Pro Pro Val Gln Pro Thr Thr Pro Ser Gly Cys Thr Ala Glu
240 245 250

Arg Trp Ala Gln Cys Gly Gly Asn Gly Trp Ser Gly Cys Thr Thr Cys
255 260 265

Val Ala Gly Ser Thr Cys Thr Lys Ile Asn Asp Trp Tyr His Gln Cys
270 275 280

Leu



(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1473 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Fusarium oxysporum 

(B) STRAIN: DSM 2672 

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 97..1224

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

GAATTCGCGG CCGCTCATTC ACTTCATTCA TTCTTTAGAA TTACATACAC TCTCTTTCAA 60

AACAGTCACT CTTTAAACAA AACAACTTTT GCAACA ATG CGA TCT TAC ACT CTT 114 

Met Arg Ser Tyr Thr Leu 
1 5 

CTC GCC CTG GCC GGC CCT CTC GCC GTG AGT GCT GCT TCT GGA AGC GGT 162 
Leu Ala Leu Ala Gly Pro Leu Ala Val Ser Ala Ala Ser Gly Ser Gly 
10 15 20 

CAC TCT ACT CGA TAC TGG GAT TGC TGC AAG CCT TCT TGC TCT TGG AGC 210
His Ser Thr Arg Tyr Trp Asp Cys Cys Lys Pro Ser Cys Ser Trp Ser 
25 30 35 

GGA AAG GCT GCT GTC AAC GCC CCT GCT TTA ACT TGT GAT AAG AAC GAC 258
Gly Lys Ala Ala Val Asn Ala Pro Ala Leu Thr Cys Asp Lys Asn Asp 
40 45 50 

AAC CCC ATT TCC AAC ACC AAT GCT GTC AAC GGT TGT GAG GGT GGT GGT 306
Asn Pro Ile Ser Asn Thr Asn Ala Val Asn Gly Cys Glu Gly Gly Gly
55 60 65 70 

TCT GCT TAT GCT TGC ACC AAC TAC TCT CCC TGG GCT GTC AAC GAT GAG 354
Ser Ala Tyr Ala Cys Thr Asn Tyr Ser Pro Trp Ala Val Asn Asp Glu 
75 80 85 

CTT GCC TAC GGT TTC GCT GCT ACC AAG ATC TCC GGT GGC TCC GAG GCC 402 
Leu Ala Tyr Gly Phe Ala Ala Thr Lys Ile Ser Gly Gly Ser Glu Ala
90 95 100 

AGC TGG TGC TGT GCT TGC TAT GCT TTG ACC TTC ACC ACT GGC CCC GTC 450 
Ser Trp Cys Cys Ala Cys Tyr Ala Leu Thr Phe Thr Thr Gly Pro Val 
105 110 115 

AAG GGC AAG AAG ATG ATC GTC CAG TCC ACC AAC ACT GGA GGT GAT CTC 498
Lys Gly Lys Lys Met Ile Val Gln Ser Thr Asn Thr Gly Gly Asp Leu
120 125 130 

GGC GAC AAC CAC TTC GAT CTC ATG ATG CCC GGC GGT GGT GTC GGT ATC 546
Gly Asp Asn His Phe Asp Leu Met Met Pro Gly Gly Gly Val Gly Ile
135 140 145 150 

TTC GAC GGC TGC ACC TCT GAG TTC GGC AAG GCT CTC GGC GGT GCC CAG 594
Phe Asp Gly Cys Thr Ser Glu Phe Gly Lys Ala Leu Gly Gly Ala Gln
155 160 165 

TAC GGC GGT ATC TCC TCC CGA AGC GAA TGT GAT AGC TAC CCC GAG CTT 642 
Tyr Gly Gly Ile Ser Ser Arg Ser Glu Cys Asp Ser Tyr Pro Glu Leu
170 175 180 

CTC AAG GAC GGT TGC CAC TGG CGA TTC GAC TGG TTC GAG AAC GCC GAC 690
Leu Lys Asp Gly Cys His Trp Arg Phe Asp Trp Phe Glu Asn Ala Asp 
185 190 195 

AAC CCT GAC TTC ACC TTT GAG CAG GTT CAG TGC CCC AAG GCT CTC CTC 738
Asn Pro Asp Phe Thr Phe Glu Gln Val Gln Cys Pro Lys Ala Leu Leu
200 205 210 

GAC ATC AGT GGA TGC AAG CGT GAT GAC GAC TCC AGC TTC CCT GCC TTC 786 
Asp Ile Ser Gly Cys Lys Arg Asp Asp Asp Ser Ser Phe Pro Ala Phe
215 220 225 230 

AAG GTT GAT ACC TCG GCC AGC AAG CCC CAG CCC TCC AGC TCC GCT AAG 834 
Lys Val Asp Thr Ser Ala Ser Lys Pro Gln Pro Ser Ser Ser Ala Lys
235 240 245 

AAG ACC ACC TCC GCT GCT GCT GCC GCT CAG CCC CAG AAG ACC AAG GAT 882 
Lys Thr Thr Ser Ala Ala Ala Ala Ala Gln Pro Gln Lys Thr Lys Asp
250 255 260 

TCC GCT CCT GTT GTC CAG AAG TCC TCC ACC AAG CCT GCC GCT CAG CCC 930
Ser Ala Pro Val Val Gln Lys Ser Ser Thr Lys Pro Ala Ala Gln Pro
265 270 275 

GAG CCT ACT AAG CCC GCC GAC AAG CCC CAG ACC GAC AAG CCT GTC GCC 978 
Glu Pro Thr Lys Pro Ala Asp Lys Pro Gln Thr Asp Lys Pro Val Ala
280 285 290

ACC AAG CCT GCT GCT ACC AAG CCC GTC CAA CCT GTC AAC AAG CCC AAG 1026
Thr Lys Pro Ala Ala Thr Lys Pro Val Gln Pro Val Asn Lys Pro Lys
295 300 305 310 

ACA ACC CAG AAG GTC CGT GGA ACC AAA ACC CGA GGA AGC TGC CCG GCC 1074 
Thr Thr Gln Lys Val Arg Gly Thr Lys Thr Arg Gly Ser Cys Pro Ala
315 320 325 

AAG ACT GAC GCT ACC GCC AAG GCC TCC GTT GTC CCT GCT TAT TAC CAG 1122 
Lys Thr Asp Ala Thr Ala Lys Ala Ser Val Val Pro Ala Tyr Tyr Gln
330 335 340 

TGT GGT GGT TCC AAG TCC GCT TAT CCC AAC GGC AAC CTC GCT TGC GCT 1170
Cys Gly Gly Ser Lys Ser Ala Tyr Pro Asn Gly Asn Leu Ala Cys Ala 
345 350 355 

ACT GGA AGC AAG TGT GTC AAG CAG AAC GAG TAC TAC TCC CAG TGT GTC 1218 
Thr Gly Ser Lys Cys Val Lys Gln Asn Glu Tyr Tyr Ser Gln Cys Val
360 365 370 

CCC AAC TAAATGGTAG ATCCATCGGT TGTGGAAGAG ACTATGCGTC TCAGAAGGGA 1274
Pro Asn
375

TCCTCTCATG AGCAGGCTTG TCATTGTATA GCATGGCATC CTGGACCAAG TGTTCGACCC 1334 

TTGTTGTACA TAGTATATCT TCATTGTATA TATTTAGACA CATAGATAGC CTCTTGTCAG 1394

CGACAACTGG CTACAAAAGA CTTGGCAGGC TTGTTCAATA TTGACACAGT TTCCTCCATA 1454

AAAAAAAAAA AAAAAAAAA 1473 

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 376 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Arg Ser Tyr Thr Leu Leu Ala Leu Ala Gly Pro Leu Ala Val Ser
1 5 10 15

Ala Ala Ser Gly Ser Gly His Ser Thr Arg Tyr Trp Asp Cys Cys Lys
20 25 30

Pro Ser Cys Ser Trp Ser Gly Lys Ala Ala Val Asn Ala Pro Ala Leu
35 40 45

Thr Cys Asp Lys Asn Asp Asn Pro Ile Ser Asn Thr Asn Ala Val Asn
50 55 60

Gly Cys Glu Gly Gly Gly Ser Ala Tyr Ala Cys Thr Asn Tyr Ser Pro
65 70 75 80

Trp Ala Val Asn Asp Glu Leu Ala Tyr Gly Phe Ala Ala Thr Lys Ile
85 90 95

Ser Gly Gly Ser Glu Ala Ser Trp Cys Cys Ala Cys Tyr Ala Leu Thr
100 105 110

Phe Thr Thr Gly Pro Val Lys Gly Lys Lys Met Ile Val Gln Ser Thr
115 120 125

Asn Thr Gly Gly Asp Leu Gly Asp Asn His Phe Asp Leu Met Met Pro
130 135 140

Gly Gly Gly Val Gly Ile Phe Asp Gly Cys Thr Ser Glu Phe Gly Lys
145 150 155 160

Ala Leu Gly Gly Ala Gln Tyr Gly Gly Ile Ser Ser Arg Ser Glu Cys
165 170 175

Asp Ser Tyr Pro Glu Leu Leu Lys Asp Gly Cys His Trp Arg Phe Asp
180 185 190

Trp Phe Glu Asn Ala Asp Asn Pro Asp Phe Thr Phe Glu Gln Val Gln
195 200 205

Cys Pro Lys Ala Leu Leu Asp Ile Ser Gly Cys Lys Arg Asp Asp Asp
210 215 220

Ser Ser Phe Pro Ala Phe Lys Val Asp Thr Ser Ala Ser Lys Pro Gln
225 230 235 240

Pro Ser Ser Ser Ala Lys Lys Thr Thr Ser Ala Ala Ala Ala Ala Gln
245 250 255

Pro Gln Lys Thr Lys Asp Ser Ala Pro Val Val Gln Lys Ser Ser Thr
260 265 270

Lys Pro Ala Ala Gln Pro Glu Pro Thr Lys Pro Ala Asp Lys Pro Gln
275 280 285

Thr Asp Lys Pro Val Ala Thr Lys Pro Ala Ala Thr Lys Pro Val Gln
290 295 300

Pro Val Asn Lys Pro Lys Thr Thr Gln Lys Val Arg Gly Thr Lys Thr
305 310 315 320

Arg Gly Ser Cys Pro Ala Lys Thr Asp Ala Thr Ala Lys Ala Ser Val
325 330 335

Val Pro Ala Tyr Tyr Gln Cys Gly Gly Ser Lys Ser Ala Tyr Pro Asn
340 345 350

Gly Asn Leu Ala Cys Ala Thr Gly Ser Lys Cys Val Lys Gln Asn Glu
355 360 365

Tyr Tyr Ser Gln Cys Val Pro Asn
370 375



(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
AGCTGCGGCC GCAGGCCGCG GAGGCCA 27



(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
AGCTTGGCCT CCGCGGCCTG CGGCCGC 27



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
AATTCGCGGC CGCGGCCATG GAGGCC 26



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
AATTGGCCTC CATGGCCGCG GCCGCG 26



(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
AAYGCYGACA AAYCC 15



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: cDNA 



WO 91/17243 
SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: NOVO NORDISK A/S
(ii) TITLE OF INVENTION: A Cellulase Preparation
(iii) NUMBER OF SEQUENCES: 4

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: NOVO NORDISK A/S, Patent Department

(B) STREET: Novo Alle

(C) CITY: Bagsvaerd

(E) COUNTRY: DENMARK

(F) ZIP: DK-2880

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER:

(B) FILING DATE:

(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Thalsoe-Madsen, Birgit

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: +45 4444 8888

(B) TELEFAX: +45 4449 3256

(C) TELEX: 37304

(2) INFORMATION FOR SEQ ID NO: 1:



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1060 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Humicola insolens 

(B) STRAIN: DSM 1800 



(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 73..927

(ix) FEATURE:

(A) NAME/KEY: sig_peptide

(B) LOCATION: 10..72

(ix) FEATURE:
(A) NAME/KEY: CDS

(B) LOCATION: 10..927

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

GGATCCAAG ATG CGT TCC TCC CCC CTC CTC CCG TCC GCC GTT GTG GCC 48
Met Arg Ser Ser Pro Leu Leu Pro Ser Ala Val Val Ala
-21 -20 -15 -10

GCC CTG CCG GTG TTG GCC CTT GCC GCT GAT GGC AGG TCC ACC CGC TAC 96
Ala Leu Pro Val Leu Ala Leu Ala Ala Asp Gly Arg Ser Thr Arg Tyr
-5 1 5

TGG GAC TGC TGC AAG CCT TCG TGC GGC TGG GCC AAG AAG GCT CCC GTG 144
Trp Asp Cys Cys Lys Pro Ser Cys Gly Trp Ala Lys Lys Ala Pro Val
10 15 20

AAC CAG CCT GTC TTT TCC TGC AAC GCC AAC TTC CAG CGT ATC ACG GAC 192
Asn Gln Pro Val Phe Ser Cys Asn Ala Asn Phe Gln Arg Ile Thr Asp
25 30 35 40

TTC GAC GCC AAG TCC GGC TGC GAG CCG GGC GGT GTC GCC TAC TCG TGC 240
Phe Asp Ala Lys Ser Gly Cys Glu Pro Gly Gly Val Ala Tyr Ser Cys
45 50 55

GCC GAC CAG ACC CCA TGG GCT GTG AAC GAC GAC TTC GCG CTC GGT TTT 288
Ala Asp Gln Thr Pro Trp Ala Val Asn Asp Asp Phe Ala Leu Gly Phe
60 65 70

GCT GCC ACC TCT ATT GCC GGC AGC AAT GAG GCG GGC TGG TGC TGC GCC 336
Ala Ala Thr Ser Ile Ala Gly Ser Asn Glu Ala Gly Trp Cys Cys Ala
75 80 85

TGC TAC GAG CTC ACC TTC ACA TCC GGT CCT GTT GCT GGC AAG AAG ATG 384
Cys Tyr Glu Leu Thr Phe Thr Ser Gly Pro Val Ala Gly Lys Lys Met
90 95 100

GTC GTC CAG TCC ACC AGC ACT GGC GGT GAT CTT GGC AGC AAC CAC TTC 432
Val Val Gln Ser Thr Ser Thr Gly Gly Asp Leu Gly Ser Asn His Phe
105 110 115 120

GAT CTC AAC ATC CCC GGC GGC GGC GTC GGC ATC TTC GAC GGA TGC ACT 480
Asp Leu Asn Ile Pro Gly Gly Gly Val Gly Ile Phe Asp Gly Cys Thr
125 130 135

CCC CAG TTC GGC GGT CTG CCC GGC CAG CGC TAC GGC GGC ATC TCG TCC 528
Pro Gln Phe Gly Gly Leu Pro Gly Gln Arg Tyr Gly Gly Ile Ser Ser
140 145 150

CGC AAC GAG TGC GAT CGG TTC CCC GAC GCC CTC AAG CCC GGC TGC TAC 576
Arg Asn Glu Cys Asp Arg Phe Pro Asp Ala Leu Lys Pro Gly Cys Tyr
155 160 165

TGG CGC TTC GAC TGG TTC AAG AAC GCC GAC AAT CCG AGC TTC AGC TTC 624
Trp Arg Phe Asp Trp Phe Lys Asn Ala Asp Asn Pro Ser Phe Ser Phe
170 175 180

CGT CAG GTC CAG TGC CCA GCC GAG CTC GTC GCT CGC ACC GGA TGC CGC 672
Arg Gln Val Gln Cys Pro Ala Glu Leu Val Ala Arg Thr Gly Cys Arg
185 190 195 200

CGC AAC GAC GAC GGC AAC TTC CCT GCC GTC CAG ATC CCC TCC AGC AGC 720
Arg Asn Asp Asp Gly Asn Phe Pro Ala Val Gln Ile Pro Ser Ser Ser
205 210 215

ACC AGC TCT CCG GTC AAC CAG CCT ACC AGC ACC AGC ACC ACG TCC ACC 768
Thr Ser Ser Pro Val Asn Gln Pro Thr Ser Thr Ser Thr Thr Ser Thr
220 225 230

TCC ACC ACC TCG AGC CCG CCA GTC CAG CCT ACG ACT CCC AGC GGC TGC 816
Ser Thr Thr Ser Ser Pro Pro Val Gln Pro Thr Thr Pro Ser Gly Cys
235 240 245

ACT GCT GAG AGG TGG GCT CAG TGC GGC GGC AAT GGC TGG AGC GGC TGC 864
Thr Ala Glu Arg Trp Ala Gln Cys Gly Gly Asn Gly Trp Ser Gly Cys
250 255 260

ACC ACC TGC GTC GCT GGC AGC ACT TGC ACG AAG ATT AAT GAC TGG TAC 912
Thr Thr Cys Val Ala Gly Ser Thr Cys Thr Lys Ile Asn Asp Trp Tyr
265 270 275 280

CAT CAG TGC CTG TAGACGCAGG GCAGCTTGAG GGCCTTACTG GTGGCCGCAA 964
His Gln Cys Leu
285

CGAAATGACA CTCCCAATCA CTGTATTAGT TCTTGTACAT AATTTCGTCA TCCCTCCAGG 1024
GATTGTCACA TAAATGCAAT GAGGAACAAT GAGTAC 1060



(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 305 amino acids
(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Arg Ser Ser Pro Leu Leu Pro Ser Ala Val Val Ala Ala Leu Pro
-21 -20 -15 -10

Val Leu Ala Leu Ala Ala Asp Gly Arg Ser Thr Arg Tyr Trp Asp Cys
-5 1 5 10

Cys Lys Pro Ser Cys Gly Trp Ala Lys Lys Ala Pro Val Asn Gln Pro
15 20 25

Val Phe Ser Cys Asn Ala Asn Phe Gln Arg Ile Thr Asp Phe Asp Ala
30 35 40

Lys Ser Gly Cys Glu Pro Gly Gly Val Ala Tyr Ser Cys Ala Asp Gln
45 50 55

Thr Pro Trp Ala Val Asn Asp Asp Phe Ala Leu Gly Phe Ala Ala Thr
60 65 70 75

Ser Ile Ala Gly Ser Asn Glu Ala Gly Trp Cys Cys Ala Cys Tyr Glu
80 85 90

Leu Thr Phe Thr Ser Gly Pro Val Ala Gly Lys Lys Met Val Val Gln
95 100 105

Ser Thr Ser Thr Gly Gly Asp Leu Gly Ser Asn His Phe Asp Leu Asn
110 115 120

Ile Pro Gly Gly Gly Val Gly Ile Phe Asp Gly Cys Thr Pro Gln Phe
125 130 135

Gly Gly Leu Pro Gly Gln Arg Tyr Gly Gly Ile Ser Ser Arg Asn Glu
140 145 150 155

Cys Asp Arg Phe Pro Asp Ala Leu Lys Pro Gly Cys Tyr Trp Arg Phe
160 165 170

Asp Trp Phe Lys Asn Ala Asp Asn Pro Ser Phe Ser Phe Arg Gln Val
175 180 185

Gln Cys Pro Ala Glu Leu Val Ala Arg Thr Gly Cys Arg Arg Asn Asp
190 195 200

Asp Gly Asn Phe Pro Ala Val Gln Ile Pro Ser Ser Ser Thr Ser Ser
205 210 215

Pro Val Asn Gln Pro Thr Ser Thr Ser Thr Thr Ser Thr Ser Thr Thr
220 225 230 235

Ser Ser Pro Pro Val Gln Pro Thr Thr Pro Ser Gly Cys Thr Ala Glu
240 245 250

Arg Trp Ala Gln Cys Gly Gly Asn Gly Trp Ser Gly Cys Thr Thr Cys
255 260 265

Val Ala Gly Ser Thr Cys Thr Lys Ile Asn Asp Trp Tyr His Gln Cys
270 275 280

Leu

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1473 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Fusarium oxysporum

(B) STRAIN: DSM 2672

(ix) FEATURE:
(A) NAME/KEY: CDS

(B) LOCATION: 97..1224

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
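GAATTCGCGG CCGCTCATTC ACTTCATTCA TTCTTTAGAA TTACATACAC TCTCTTTCAA 60

AACAGTCACT CTTTAAACAA AACAACTTTT GCAACA ATG CGA TCT TAC ACT CTT 114
Met Arg Ser Tyr Thr Leu
1 5

CTC GCC CTG GCC GGC CCT CTC GCC GTG AGT GCT GCT TCT GGA AGC GGT 162
Leu Ala Leu Ala Gly Pro Leu Ala Val Ser Ala Ala Ser Gly Ser Gly
10 15 20

CAC TCT ACT CGA TAC TGG GAT TGC TGC AAG CCT TCT TGC TCT TGG AGC 210
His Ser Thr Arg Tyr Trp Asp Cys Cys Lys Pro Ser Cys Ser Trp Ser
25 30 35

GGA AAG GCT GCT GTC AAC GCC CCT GCT TTA ACT TGT GAT AAG AAC GAC 258
Gly Lys Ala Ala Val Asn Ala Pro Ala Leu Thr Cys Asp Lys Asn Asp
40 45 50

AAC CCC ATT TCC AAC ACC AAT GCT GTC AAC GGT TGT GAG GGT GGT GGT 306
Asn Pro Ile Ser Asn Thr Asn Ala Val Asn Gly Cys Glu Gly Gly Gly
55 60 65 70

TCT GCT TAT GCT TGC ACC AAC TAC TCT CCC TGG GCT GTC AAC GAT GAG 354
Ser Ala Tyr Ala Cys Thr Asn Tyr Ser Pro Trp Ala Val Asn Asp Glu
75 80 85

CTT GCC TAC GGT TTC GCT GCT ACC AAG ATC TCC GGT GGC TCC GAG GCC 402
Leu Ala Tyr Gly Phe Ala Ala Thr Lys Ile Ser Gly Gly Ser Glu Ala
90 95 100

AGC TGG TGC TGT GCT TGC TAT GCT TTG ACC TTC ACC ACT GGC CCC GTC 450
Ser Trp Cys Cys Ala Cys Tyr Ala Leu Thr Phe Thr Thr Gly Pro Val
105 110 115

AAG GGC AAG AAG ATG ATC GTC CAG TCC ACC AAC ACT GGA GGT GAT CTC 498
Lys Gly Lys Lys Met Ile Val Gln Ser Thr Asn Thr Gly Gly Asp Leu
120 125 130

GGC GAC AAC CAC TTC GAT CTC ATG ATG CCC GGC GGT GGT GTC GGT ATC 546
Gly Asp Asn His Phe Asp Leu Met Met Pro Gly Gly Gly Val Gly Ile
135 140 145 150

TTC GAC GGC TGC ACC TCT GAG TTC GGC AAG GCT CTC GGC GGT GCC CAG 594
Phe Asp Gly Cys Thr Ser Glu Phe Gly Lys Ala Leu Gly Gly Ala Gln
155 160 165

TAC GGC GGT ATC TCC TCC CGA AGC GAA TGT GAT AGC TAC CCC GAG CTT 642
Tyr Gly Gly Ile Ser Ser Arg Ser Glu Cys Asp Ser Tyr Pro Glu Leu
170 175 180

CTC AAG GAC GGT TGC CAC TGG CGA TTC GAC TGG TTC GAG AAC GCC GAC 690
Leu Lys Asp Gly Cys His Trp Arg Phe Asp Trp Phe Glu Asn Ala Asp
185 190 195

AAC CCT GAC TTC ACC TTT GAG CAG GTT CAG TGC CCC AAG GCT CTC CTC 738
Asn Pro Asp Phe Thr Phe Glu Gln Val Gln Cys Pro Lys Ala Leu Leu
200 205 210

GAC ATC AGT GGA TGC AAG CGT GAT GAC GAC TCC AGC TTC CCT GCC TTC 786
Asp Ile Ser Gly Cys Lys Arg Asp Asp Asp Ser Ser Phe Pro Ala Phe
215 220 225 230

AAG GTT GAT ACC TCG GCC AGC AAG CCC CAG CCC TCC AGC TCC GCT AAG 834
Lys Val Asp Thr Ser Ala Ser Lys Pro Gln Pro Ser Ser Ser Ala Lys
235 240 245

AAG ACC ACC TCC GCT GCT GCT GCC GCT CAG CCC CAG AAG ACC AAG GAT 882
Lys Thr Thr Ser Ala Ala Ala Ala Ala Gln Pro Gln Lys Thr Lys Asp
250 255 260

TCC GCT CCT GTT GTC CAG AAG TCC TCC ACC AAG CCT GCC GCT CAG CCC 930
Ser Ala Pro Val Val Gln Lys Ser Ser Thr Lys Pro Ala Ala Gln Pro
265 270 275

GAG CCT ACT AAG CCC GCC GAC AAG CCC CAG ACC GAC AAG CCT GTC GCC 978
Glu Pro Thr Lys Pro Ala Asp Lys Pro Gln Thr Asp Lys Pro Val Ala
280 285 290

ACC AAG CCT GCT GCT ACC AAG CCC GTC CAA CCT GTC AAC AAG CCC AAG 1026
Thr Lys Pro Ala Ala Thr Lys Pro Val Gln Pro Val Asn Lys Pro Lys
295 300 305 310

ACA ACC CAG AAG GTC CGT GGA ACC AAA ACC CGA GGA AGC TGC CCG GCC 1074
Thr Thr Gln Lys Val Arg Gly Thr Lys Thr Arg Gly Ser Cys Pro Ala
315 320 325

AAG ACT GAC GCT ACC GCC AAG GCC TCC GTT GTC CCT GCT TAT TAC CAG 1122
Lys Thr Asp Ala Thr Ala Lys Ala Ser Val Val Pro Ala Tyr Tyr Gln
330 335 340

TGT GGT GGT TCC AAG TCC GCT TAT CCC AAC GGC AAC CTC GCT TGC GCT 1170
Cys Gly Gly Ser Lys Ser Ala Tyr Pro Asn Gly Asn Leu Ala Cys Ala
345 350 355

ACT GGA AGC AAG TGT GTC AAG CAG AAC GAG TAC TAC TCC CAG TGT GTC 1218
Thr Gly Ser Lys Cys Val Lys Gln Asn Glu Tyr Tyr Ser Gln Cys Val
360 365 370

CCC AAC TAAATGGTAG ATCCATCGGT TGTGGAAGAG ACTATGCGTC TCAGAAGGGA 1274
Pro Asn
375

TCCTCTCATG AGCAGGCTTG TCATTGTATA GCATGGCATC CTGGACCAAG TGTTCGACCC 1334
TTGTTGTACA TAGTATATCT TCATTGTATA TATTTAGACA CATAGATAGC CTCTTGTCAG 1394
CGACAACTGG CTACAAAAGA CTTGGCAGGC TTGTTCAATA TTGACACAGT TTCCTCCATA 1454
AAAAAAAAAA AAAAAAAAA 1473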

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
AACGAYGAYG GNAAYTTCCC 20



(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
AAYGAYTGGT ACCAYCARTG 20



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
GCGCCAGTAG CAGCCGGGCT TGAGGG 26



(2) INFORMATION FOR SEQ ID NO:13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
ACGTCTCAAC TCGGATCCAA GATGCGTT 28



(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
CTCAACTCTG ATCAAGATGC GTTCC 25 



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
TGTCGACCAG TAAGGCCCTC AAGCTG 26 



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: RNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
GACAGAGCAC AGAATTCACT AGTGAGCTCT 30



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
TGGGAYTGYT GYAARCC 17 



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
AGGGAGACCG GAATTCTGGG AYTGYTGYAA RCC 33



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
CCNGGNGGNG GNGTNGG 17



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
AGGGAGACCG GAATTCCCNG GNGGNGGNGT NGG 33



(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
ACNAYCATNK TYTTNCC 17



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
GACAGAGCAC AGAATTCACN AYCATNKTYT TNCC 34



(2) INFORMATION FOR SEQ ID NO:23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
NGGRTTRTCN GCNKYYTYRA ACCA 24



(2) INFORMATION FOR SEQ ID NO:24: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 41 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
GACAGAGCAC AGAATTCNGG RTTRTCNGCN KYYTYRAACC A 41



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 
GGGGTAGCTA TCACATTCGC TTCGGGAGGA GATACCGCCG TA 42 



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 
CTTCTTGCTC TTGGAGCGGA AAGGCTGCTG TCAACGCCCC TG 42 



(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 
TGTACGCATG TAACATTA 18 



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:
CTGCACAATA TTTCAAGC 18



(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:
GGGGTAGCTA TCACATTCGC TTCGGGAGGA GATACCGCCG TA 42



(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 
CTTCTTGCTC TTGGAGCGGA AAGGCTGCTG TCAACGCCCC TG 42



(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 
AGCTTCTCAA GGACGGTT 18



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32: 
AACAAGGGTC GAACACTT 18



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:
CCAGAAGACC AAGGATT 17



WO 91/17243

PCT/DK91/00123

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 376 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Arg Ser Tyr Thr Leu Leu Ala Leu Ala Gly Pro Leu Ala Val Ser
1 5 10 15

Ala Ala Ser Gly Ser Gly His Ser Thr Arg Tyr Trp Asp Cys Cys Lys
20 25 30

Pro Ser Cys Ser Trp Ser Gly Lys Ala Ala Val Asn Ala Pro Ala Leu
35 40 45

Thr Cys Asp Lys Asn Asp Asn Pro Ile Ser Asn Thr Asn Ala Val Asn
50 55 60

Gly Cys Glu Gly Gly Gly Ser Ala Tyr Ala Cys Thr Asn Tyr Ser Pro
65 70 75 80

Trp Ala Val Asn Asp Glu Leu Ala Tyr Gly Phe Ala Ala Thr Lys Ile
85 90 95

Ser Gly Gly Ser Glu Ala Ser Trp Cys Cys Ala Cys Tyr Ala Leu Thr
100 105 110

Phe Thr Thr Gly Pro Val Lys Gly Lys Lys Met Ile Val Gln Ser Thr
115 120 125

Asn Thr Gly Gly Asp Leu Gly Asp Asn His Phe Asp Leu Met Met Pro
130 135 140

Gly Gly Gly Val Gly Ile Phe Asp Gly Cys Thr Ser Glu Phe Gly Lys
145 150 155 160

Ala Leu Gly Gly Ala Gln Tyr Gly Gly Ile Ser Ser Arg Ser Glu Cys
165 170 175

Asp Ser Tyr Pro Glu Leu Leu Lys Asp Gly Cys His Trp Arg Phe Asp
180 185 190

Trp Phe Glu Asn Ala Asp Asn Pro Asp Phe Thr Phe Glu Gln Val Gln
195 200 205

Cys Pro Lys Ala Leu Leu Asp Ile Ser Gly Cys Lys Arg Asp Asp Asp
210 215 220

Ser Ser Phe Pro Ala Phe Lys Val Asp Thr Ser Ala Ser Lys Pro Gln
225 230 235 240

Pro Ser Ser Ser Ala Lys Lys Thr Thr Ser Ala Ala Ala Ala Ala Gln
245 250 255

Pro Gln Lys Thr Lys Asp Ser Ala Pro Val Val Gln Lys Ser Ser Thr
260 265 270

Lys Pro Ala Ala Gln Pro Glu Pro Thr Lys Pro Ala Asp Lys Pro Gln
275 280 285

Thr Asp Lys Pro Val Ala Thr Lys Pro Ala Ala Thr Lys Pro Val Gln
290 295 300

Pro Val Asn Lys Pro Lys Thr Thr Gln Lys Val Arg Gly Thr Lys Thr
305 310 315 320

Arg Gly Ser Cys Pro Ala Lys Thr Asp Ala Thr Ala Lys Ala Ser Val
325 330 335

Val Pro Ala Tyr Tyr Gln Cys Gly Gly Ser Lys Ser Ala Tyr Pro Asn
340 345 350

Gly Asn Leu Ala Cys Ala Thr Gly Ser Lys Cys Val Lys Gln Asn Glu
355 360 365

Tyr Tyr Ser Gln Cys Val Pro Asn
370 375



